Quickstart

This notebook covers the basic and most useful functionality to get a user up and running as fast as possible.

Installation

Installation can be done via a pip install:

pip install remotemanager for the most recent stable version.

However if you would like the bleeding edge version, you can clone the devel branch of the git repository:

git clone --branch devel <repository url> && pip install ./remotemanager

Function Definition

remotemanager executes user defined python functions at the location of choice. Below is a basic function example which will serve our purposes for this guide.

Important

The function must stand by itself when running, so any imports or necessary functionality should be contained within.

[1]:
def multiply(a, b):
    import time

    time.sleep(1)

    return a * b
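
Since the function carries its own import, it can be sanity-checked locally before any remote setup:

```python
def multiply(a, b):
    import time

    time.sleep(1)

    return a * b

print(multiply(21, 2))  # → 42
```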

Running Remotely

This function would run just fine on any workstation, but for something more complex we would need to connect to more powerful resources.

remotemanager provides the powerful Computer module for this purpose:

[2]:
from remotemanager import Computer

First, we must define a “template”. This is the base from which a submission script will be generated.

The easiest way to create one of these templates is to acquire a jobscript that you know works for your machine. A few suggestions for this:

  • Machine documentation may have an example script to build from (or even a configurator!)

  • If you have already run jobs, your own scripts should suffice; otherwise a colleague may have an example for you

  • The helpdesk may be able to assist you in creating a jobscript for your use case

In this example, we will be taking an existing jobscript that we know works.

We will also parameterise just a single option, #username#. This syntax allows Computer to provide a “dynamic” input that can be changed.

The basic syntax for parameterisation is that anything between double #hashes# will be treated as a parameter and added to the computer. Here, for example, a variable called “hashes” would be created.

Important

Parameters will be sanitised to all lowercase. Therefore #ARG# == #arg#.
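
To illustrate the syntax only (this is a rough sketch, not remotemanager's actual parser), double-hash parameters could be scanned out of a template like so:

```python
import re

template = "#SBATCH --account=#USERNAME#\n#SBATCH --qos=#qos#"

# anything between double hashes becomes a parameter; names are lowercased,
# so #USERNAME# and #username# refer to the same parameter
params = sorted({m.lower() for m in re.findall(r"#(\w+)#", template)})

print(params)  # ['qos', 'username']
```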

Important

Parameters must not clash with internal names, an error will be raised in this case. For example, we have to choose #username# here instead of #user#, since user is already an internal argument.

Note

This is covered in greater detail in the dedicated tutorial.

[3]:
template = """#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00

#SBATCH --job-name=quickstart

#SBATCH --account=#username#
#SBATCH --partition=boost_usr_prod
#SBATCH --qos=normal

export OMP_NUM_THREADS=4

module load python/3.10.8--gcc--11.3.0
"""

Now, create a Computer. At a minimum you should specify:

  • Host address (or userhost=user@host)

  • The submitter that your job system uses (defaults to bash, which will run on the login node)

[4]:
connection = Computer(
    user="user",
    host='remote.hpc.url',
    submitter="sbatch",
    template=template
)

# note that template arguments must be specified after initialisation
connection.username = "myuser"

This example connection is pointed at an imaginary user@remote.hpc.url. However, this uses your ssh configuration, so you are able to connect to a machine in the same way that you would from a command line.

For example, if there existed a machine which you connected to with ssh machine, then you are able to create a computer using:

connection = Computer("machine")

Important

Computer requires that you are able to ssh into the remote machine without any additional prompts from the remote. For connection difficulties regarding permissions, see the relevant section of the introduction.

Tip

The connection parameters inherit those from your ssh config. So if you are able to ssh <host>, you can create a Computer with Computer("<host>").

Tip

Before using Computer for the first time on a machine, any immediate problems can be discovered by testing a basic command. Start with a simple ssh user@remote "ls" and see what comes back. If the terminal returns a sensible output without prompting for a password, a Computer should function as expected.

Now that we have a connection ready to go, we can see an example of the script that would be produced:

[6]:
print(connection.script())
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00

#SBATCH --job-name=quickstart

#SBATCH --account=myuser
#SBATCH --partition=boost_usr_prod
#SBATCH --qos=normal

export OMP_NUM_THREADS=4

# module load python/3.10.8--gcc--11.3.0

Remote Commands

With this remote connection in place, we can execute commands and (more importantly) our function on this machine.

For commands, the connection provides a cmd method, which will execute any string given:

[7]:
connection.cmd('echo "this command is executed on the remote"')
[7]:
this command is executed on the remote

Running Functions

For function execution, we require a Dataset.

Note

Think of a Dataset as a container for a function.

Like URL, this can be imported directly from remotemanager:

[8]:
from remotemanager import Dataset

To create a dataset, pass your function to the Dataset constructor.

Note

When passing a function to the dataset, do not call it within the assignment. For example, use Dataset(function=multiply), not Dataset(function=multiply()).
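
The distinction matters because calling the function evaluates it immediately, while the bare name is the function object itself. A quick illustration in plain Python:

```python
def multiply(a, b):
    return a * b

f = multiply        # the function object itself, which is what Dataset expects
r = multiply(2, 3)  # calling it returns the value 6, not a function

print(callable(f), callable(r))  # True False
```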

Here we are additionally specifying the local_dir and the remote_dir, which tells the Dataset where to put all relevant files on the local and remote machines, respectively.

Note

We will use skip=False in the Dataset creation; otherwise the Dataset would find any previously created dataset and import its data, rather than create itself anew.

[9]:
ds = Dataset(function=multiply,
             url=connection,
             local_dir='temp_local',  # Location where files will be "staged", before sending to the remote
             remote_dir='temp_remote', # Location on the remote server where the run will be executed
             skip=False
            )

Important

This dataset has no runs, as it is just a container for the function multiply. For this, we must add runners.

Creating runs

To add runs, we use the Dataset.append_run() method. This will take the arguments in dict format, and store them for later.

You may do this in any way you see fit; the important part is to pass a dictionary which contains all necessary arguments for the running of your function:

[10]:
runs = [[21, 2],
        [64, 8],
        [10, 7]]

for run in runs:

    a = run[0]
    b = run[1]

    arguments = {'a': a, 'b': b}

    ds.append_run(arguments=arguments)
appended run runner-0
appended run runner-1
appended run runner-2

Running and Retrieving your results

Now that we have created a dataset and appended some runs, we can launch the calculations via the Dataset.run() method.

Once the runs have completed, you can retrieve your results with ds.fetch_results(); after that, they are accessible via ds.results.

Important

fetch_results() does not return your results, but collects the files and stores them within the runners.

[11]:
ds.run()
Staging Dataset... Staged 3/3 Runners
Transferring for 3/3 Runners
Transferring 9 Files... Done
Remotely executing 3/3 Runners
[11]:
True

Wait

Calculations can take time, so we can add an optional wait call here to await the dataset's completion.

The first number is the check interval, the second is the maximum wait time (set to None for an indefinite wait).

[12]:
ds.wait(1, 10)
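
As a hypothetical sketch of these semantics (not remotemanager's implementation), such a wait amounts to a polling loop:

```python
import time

def wait_for(check_complete, interval, max_time):
    # Poll check_complete() every `interval` seconds until it returns True,
    # giving up after `max_time` seconds (None waits indefinitely).
    waited = 0.0
    while not check_complete():
        if max_time is not None and waited >= max_time:
            raise RuntimeError("wait timed out")
        time.sleep(interval)
        waited += interval
```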

Now that the run has completed, we must fetch the results before they become available:

[13]:
ds.fetch_results()
Fetching results
Transferring 6 Files... Done

Results have been fetched from the remote, now we can access them.

[14]:
print(ds.results)
[42, 512, 70]
[15]:
ds.errors
[15]:
[None, None, None]
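
As a sketch of post-processing, the results can be paired back up with the inputs, assuming ds.results preserves the order in which the runs were appended (as the output above suggests):

```python
# the inputs appended earlier, and the values ds.results returned above
runs = [[21, 2], [64, 8], [10, 7]]
results = [42, 512, 70]

summary = [f"{a} * {b} = {r}" for (a, b), r in zip(runs, results)]
print("\n".join(summary))
```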

With this, you have all of the basic tools needed to run Python functions on a remote machine. See the other tutorials for more advanced usage.

Warning

Be aware that on macOS, you may receive some errors when transferring data. This is most likely due to macOS natively shipping an old rsync version (<3.0.0). More information is available on this page.